Extension of CART using multiple splits under order restrictions
Abstract
CART was introduced by Breiman et al. (1984) as a classification tool. It recursively divides the whole sample into two subpopulations by finding the best possible split with respect to an optimisation criterion. This method, restricted to date to binary splits, is extended in this paper to allow multiple splits as well. The main problem with this extension is determining the optimal number of splits and the location of the corresponding cutpoints. To reduce the computational effort and enhance parsimony, reduced isotonic regression was used to solve this problem. The extended CART method was tested in a simulation study and compared with the classical approach in an epidemiological study. In both studies the extended CART turned out to be a useful and reliable alternative.
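As a rough illustration of how an order restriction can suggest several cutpoints at once, the sketch below runs plain isotonic regression (the pool-adjacent-violators algorithm) on a binary response ordered by a predictor and reads candidate cutpoints off the boundaries between the fitted constant blocks. This is not the paper's procedure: the reduced isotonic regression mentioned in the abstract additionally prunes blocks, which is not shown here. The function names `pava_blocks` and `candidate_cutpoints` are hypothetical, and the sketch assumes distinct predictor values and a non-decreasing trend.

```python
def pava_blocks(y):
    """Pool-adjacent-violators: fit a non-decreasing step function to y.

    Returns a list of (start, end, mean) blocks with `end` exclusive.
    """
    blocks = []  # each block is [start, end, sum, count]
    for i, value in enumerate(y):
        blocks.append([i, i + 1, float(value), 1])
        # Merge backwards while the previous block's mean is not strictly
        # below the last block's mean (ties are pooled as well).
        while len(blocks) > 1 and (
            blocks[-2][2] / blocks[-2][3] >= blocks[-1][2] / blocks[-1][3]
        ):
            last = blocks.pop()
            blocks[-1][1] = last[1]
            blocks[-1][2] += last[2]
            blocks[-1][3] += last[3]
    return [(s, e, total / n) for s, e, total, n in blocks]


def candidate_cutpoints(x, y):
    """Sort by x, run PAVA on y, return midpoints between adjacent blocks.

    Assumes the values in x are distinct (an assumption of this sketch).
    """
    order = sorted(range(len(x)), key=lambda i: x[i])
    xs = [x[i] for i in order]
    ys = [y[i] for i in order]
    blocks = pava_blocks(ys)
    # Every block except the last ends where a new level begins.
    return [(xs[e - 1] + xs[e]) / 2 for (_, e, _) in blocks[:-1]]


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5, 6]
    y = [0, 0, 1, 0, 1, 1]
    print(pava_blocks(y))
    print(candidate_cutpoints(x, y))
```

In this toy input the fit has three levels, so two cutpoints are proposed in a single step, which a binary-split procedure would only reach through two successive splits.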
Similar resources
Comparing ANN and CART to Model Multiple Land Use Changes: A Case Study of Sari and Ghaem-Shahr Cities in Iran
Most land use change modelers have modeled binary land use change rather than multiple land use changes. As a first objective of this study, we compared two well-known LUC models, classification and regression tree (CART) and artificial neural network (ANN), from two groups of data mining tools, global parametric and local non-parametric models, to model multiple LUCs. The ca...
Experimental Evaluation of Algorithmic Effort Estimation Models using Projects Clustering
One of the most important aspects of software project management is the estimation of the cost and time required to run an information system. Therefore, software managers try to carry out estimation based on behavior, properties, and project restrictions. Software cost estimation refers to the process of predicting the development requirements of a software system. Various kinds of effort estimation patter...
Discussion of the paper Bayesian Treed Generalized Linear Models
In this stimulating paper, the authors have successfully exploited Markov chain Monte Carlo methods to explore the space of graphs for CART-like trees in which the terminal nodes represent generalized linear models (GLMs). Integration over the parameters of the terminal GLMs, in order to compute the marginal likelihood (probability of data given the model) for the MCMC search, is accomplished u...
Evaluation of wheat genotypes under tillage practices: application of technique for order preference by similarity to ideal solution method
Adoption of conservative agriculture at the farm level is associated with reduced production costs and leads to crop yield stability. The aim of this study was to prioritize experimental treatments based on different criteria by applying the "technique for order preference by similarity to ideal solution" (TOPSIS). A field experiment was carried out at Zarghan research station, Fars province, Iran,...
Using Pairs of Data-Points to Define Splits for Decision Trees
Conventional binary classification trees such as CART either split the data using axis-aligned hyperplanes or they perform a computationally expensive search in the continuous space of hyperplanes with unrestricted orientations. We show that the limitations of the former can be overcome without resorting to the latter. For every pair of training data-points, there is one hyperplane that is orth...